Scope completion truncation to active provider by pntech20 · Pull Request #49 · WilliamAGH/java-chat

pntech20 · 2026-05-17T02:25:35Z

Summary

Scopes completion prompt truncation to the provider selected for the current request attempt.
Moves completion truncation into buildCompletionRequest so fallback providers use their own model limits.
Adds regression coverage for OpenAI gpt-4o alongside the default GitHub Models openai/gpt-5.

Fixes #40.
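
The difference between the old and new scoping can be sketched as follows. This is a minimal illustration, not the repository's code: the class name, the `Provider` enum, the model map, and the 100,000-token default are assumptions made for the example; only the 7,000-token GPT-5 figure and the two model IDs come from this PR's discussion.

```java
import java.util.Map;

/** Illustrative only: names and the default limit are assumptions, not the repository's code. */
final class TruncationScopeSketch {
    enum Provider { OPENAI, GITHUB_MODELS }

    static final int GPT5_INPUT_LIMIT = 7_000;      // figure quoted in the review
    static final int DEFAULT_INPUT_LIMIT = 100_000; // hypothetical placeholder

    // Hypothetical configuration mirroring the two models named in this PR.
    static final Map<Provider, String> CONFIGURED_MODELS = Map.of(
            Provider.OPENAI, "gpt-4o",
            Provider.GITHUB_MODELS, "openai/gpt-5");

    static int limitForModel(String modelId) {
        // Drop an optional "vendor/" prefix before checking the model family.
        String name = modelId.substring(modelId.lastIndexOf('/') + 1);
        return name.startsWith("gpt-5") ? GPT5_INPUT_LIMIT : DEFAULT_INPUT_LIMIT;
    }

    /** Old behaviour as described: the strictest limit of any configured provider wins. */
    static int limitAcrossAllProviders() {
        return CONFIGURED_MODELS.values().stream()
                .mapToInt(TruncationScopeSketch::limitForModel)
                .min()
                .orElse(DEFAULT_INPUT_LIMIT);
    }

    /** New behaviour: only the provider used for this attempt is consulted. */
    static int limitForActiveProvider(Provider active) {
        return limitForModel(CONFIGURED_MODELS.get(active));
    }

    public static void main(String[] args) {
        System.out.println(limitAcrossAllProviders());                      // 7000
        System.out.println(limitForActiveProvider(Provider.OPENAI));        // 100000
        System.out.println(limitForActiveProvider(Provider.GITHUB_MODELS)); // 7000
    }
}
```

Under these assumptions, an OpenAI attempt no longer inherits the GitHub Models limit just because both providers are configured.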

Verification

JAVA_HOME=C:\Users\Admin\AppData\Local\Codex\jdks\temurin-25\jdk-25.0.3+9 ./gradlew.bat test --tests com.williamcallahan.javachat.service.OpenAiRequestFactoryTest
JAVA_HOME=C:\Users\Admin\AppData\Local\Codex\jdks\temurin-25\jdk-25.0.3+9 ./gradlew.bat test --tests com.williamcallahan.javachat.service.OpenAIStreamingServiceTest
JAVA_HOME=C:\Users\Admin\AppData\Local\Codex\jdks\temurin-25\jdk-25.0.3+9 ./gradlew.bat spotlessCheck
git diff --check

Full ./gradlew.bat test runs 244 tests but fails on the existing Windows environment because EnvironmentVariablePrecedenceTest hardcodes /bin/bash, which is not present here.

Greptile Summary

This PR fixes issue #40 by moving prompt truncation inside buildCompletionRequest so that each provider attempt uses its own model's token limits, rather than applying a single pre-loop truncation derived from the union of both providers' characteristics.

OpenAIStreamingService.complete() now passes the raw prompt to buildCompletionRequest, which calls the private truncatePromptForCompletion(String, String) with the resolved model ID for the active provider.
OpenAiRequestFactory gains a public provider-arg overload of truncatePromptForCompletion and a private model-ID overload that contains the actual logic; the original single-argument public method now delegates to the provider-arg overload with OPENAI as the default.
Two new tests verify that gpt-4o (OpenAI) leaves an ~8K-token prompt untouched while openai/gpt-5 (GitHub Models) truncates it with the appropriate notice.

Confidence Score: 4/5

Safe to merge; the core logic change is correct and well-tested for the targeted scenarios.

The refactor correctly scopes truncation to the active provider on each attempt. Two dead public overloads have no production callers after the change, and the o-series 7K limit assumption silently under-serves larger-context models. Neither issue affects correctness for the currently configured models.

OpenAiRequestFactory.java — the dead public overloads and the o-series token limit assumption are worth a second look before this code path grows further.

Important Files Changed

Filename	Overview
src/main/java/com/williamcallahan/javachat/service/OpenAIStreamingService.java	Removes pre-loop truncation and passes the raw prompt to buildCompletionRequest; correct and minimal change.
src/main/java/com/williamcallahan/javachat/service/OpenAiRequestFactory.java	Moves per-provider truncation into buildCompletionRequest; introduces two public overloads that have no production callers after the refactor (dead API), and the o-series branch still applies GPT-5's 7K limit.
src/test/java/com/williamcallahan/javachat/service/OpenAiRequestFactoryTest.java	Adds two tests for the new public provider-arg overload; integration between buildCompletionRequest and truncation for GitHub Models provider is not directly covered.

Sequence Diagram

sequenceDiagram
    participant C as Caller
    participant S as OpenAIStreamingService
    participant F as OpenAiRequestFactory
    participant P as ProviderRoutingService

    C->>S: complete(prompt, temperature)
    S->>P: selectAvailableProviderCandidates(...)
    P-->>S: [providerCandidate, ...]

    loop for each providerCandidate
        S->>F: buildCompletionRequest(prompt, temperature, activeProvider)
        Note over F: normalizedModelId(useGitHubModels)
        F->>F: truncatePromptForCompletion(prompt, modelId)
        Note over F: gpt5Family / reasoningModel check
        F-->>S: ResponseCreateParams (with truncated prompt)
        S->>P: client.responses().create(requestParameters)
        alt success
            P-->>S: Response
            S-->>C: Mono.just(text)
        else RuntimeException
            S->>P: recordProviderFailure(...)
            Note over S: fallback to next provider if eligible
        end
    end
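
The loop in the diagram can be approximated with a small stand-alone simulation. Everything here is hypothetical scaffolding: `Candidate`, the character-based limit, and the fake client stand in for the real provider candidates, token limits, and SDK client, and the failure handling is reduced to "try the next candidate". The point it illustrates is only that the raw prompt is passed into the request builder on every attempt.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;

/** Hypothetical simulation of the per-attempt flow; not the service's actual code. */
final class FallbackLoopSketch {
    /** Stand-in for a provider candidate with a simplified, character-based prompt limit. */
    record Candidate(String providerName, int maxPromptChars) {}

    /** Truncation happens here, per candidate, rather than once before the loop. */
    static String buildCompletionRequest(String rawPrompt, Candidate candidate) {
        return rawPrompt.length() <= candidate.maxPromptChars()
                ? rawPrompt
                : rawPrompt.substring(0, candidate.maxPromptChars());
    }

    static Optional<String> complete(
            String rawPrompt,
            List<Candidate> candidates,
            BiFunction<Candidate, String, String> client) {
        for (Candidate candidate : candidates) {
            // Each attempt starts again from the untruncated prompt.
            String request = buildCompletionRequest(rawPrompt, candidate);
            try {
                return Optional.of(client.apply(candidate, request));
            } catch (RuntimeException failure) {
                // Stand-in for recording the failure and checking fallback eligibility.
            }
        }
        return Optional.empty();
    }

    /** Fake client: the named provider always fails; any other echoes the request size. */
    static BiFunction<Candidate, String, String> clientFailingFor(String failingProvider) {
        return (candidate, request) -> {
            if (candidate.providerName().equals(failingProvider)) {
                throw new IllegalStateException(failingProvider + " unavailable");
            }
            return candidate.providerName() + ":" + request.length();
        };
    }
}
```

With a 50-character prompt and candidates limited to 10 and 100 characters, a failure of the stricter provider lets the fallback see all 50 characters, which is the behaviour a single pre-loop truncation would have prevented.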

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
src/main/java/com/williamcallahan/javachat/service/OpenAiRequestFactory.java:127-129
**Dead public API after this refactor**

The single-argument `truncatePromptForCompletion(String prompt)` no longer has any production caller: its only call site in `OpenAIStreamingService.complete()` was removed by this PR, and `buildCompletionRequest` now invokes the private `truncatePromptForCompletion(String, String)` directly. Keeping the method conflicts with [AB1d] ("Delete unused code instead of keeping it 'just in case'") and [RC1b] ("No compatibility shims that hide defects"). The same applies to the new two-arg public overload at line 138, which is also reachable only from tests; `buildCompletionRequest` bypasses it and calls the private method itself.

Consider removing both public overloads and testing via `buildCompletionRequest` directly, which exercises the full provider-to-model-id path, or make the provider-arg overload the single public entry point.
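
One way the second option (a single public entry point) might look is sketched below. This is an assumption-laden outline rather than a proposed patch: the class is a self-contained stand-in, `buildCompletionRequest` returns a plain string instead of `ResponseCreateParams`, the private helper is renamed `truncateForModel` for readability, and the notice text, the 4-characters-per-token heuristic, the keep-the-head truncation, and the 100,000-token default are all invented for illustration.

```java
import java.util.Locale;

/** Hypothetical shape only; the real factory's fields and return types differ. */
final class SingleEntryPointSketch {
    enum ApiProvider { OPENAI, GITHUB_MODELS }

    static final int CHARS_PER_TOKEN = 4;                // rough heuristic, assumed
    static final int MAX_TOKENS_GPT5_INPUT = 7_000;      // figure quoted in the review
    static final int MAX_TOKENS_DEFAULT_INPUT = 100_000; // hypothetical placeholder
    static final String NOTICE = "[Context truncated to fit model input limit]\n"; // invented text

    /** The only public truncation method; tests and production share it. */
    public static String truncatePromptForCompletion(String prompt, ApiProvider provider) {
        return truncateForModel(prompt, modelIdFor(provider));
    }

    /** Production path calls the same public method the tests exercise. */
    public static String buildCompletionRequest(String prompt, ApiProvider provider) {
        return "model=" + modelIdFor(provider) + "\n" + truncatePromptForCompletion(prompt, provider);
    }

    // Hypothetical provider-to-model mapping using the models named in this PR.
    private static String modelIdFor(ApiProvider provider) {
        return provider == ApiProvider.GITHUB_MODELS ? "openai/gpt-5" : "gpt-4o";
    }

    private static String truncateForModel(String prompt, String modelId) {
        String name = modelId.substring(modelId.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);
        int tokenLimit = name.startsWith("gpt-5") ? MAX_TOKENS_GPT5_INPUT : MAX_TOKENS_DEFAULT_INPUT;
        int maxChars = tokenLimit * CHARS_PER_TOKEN;
        // Assumption: keep the head of the prompt and prepend a notice when truncating.
        return prompt.length() <= maxChars ? prompt : NOTICE + prompt.substring(0, maxChars);
    }
}
```

With this shape there is no public method that production code skips, so a test of the provider-arg method and a test of `buildCompletionRequest` cover the same logic.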

### Issue 2 of 2
src/main/java/com/williamcallahan/javachat/service/OpenAiRequestFactory.java:149-152
**`o`-series models receive GPT-5's 7K token limit regardless of context window**

`canonicalModelName(modelId).startsWith("o")` captures `o1`, `o3`, `o3-mini`, etc. and routes them to `MAX_TOKENS_GPT5_INPUT` (7 000 tokens). Many o-series models expose far larger context windows and this mismatch silently truncates prompts that would fit. This was also true before the PR, but the refactor now makes this path the single authoritative one for both providers, so the blast radius is wider. At minimum the assumption should be documented; at best, o-series models should have their own named constant and explicit limit.
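
A possible shape for the "own named constant" suggestion is sketched here. The names `Family`, `familyOf`, `inputTokenLimit`, and `MAX_TOKENS_O_SERIES_INPUT` are hypothetical, as is the 100,000-token default. The o-series value is deliberately left at 7,000 so the sketch preserves the behaviour described above; the correct figure would need to be taken from the provider's documented limits. Matching `o` followed by a digit instead of any name starting with `o` is an optional tightening added for illustration, not something the review requested.

```java
import java.util.Locale;

/** Hypothetical sketch of per-family input limits; values other than 7,000 are placeholders. */
final class ModelInputLimitsSketch {
    enum Family { GPT5, O_SERIES, DEFAULT }

    static final int MAX_TOKENS_GPT5_INPUT = 7_000;
    // Assumption: same value as today, but named separately so it can diverge once verified.
    static final int MAX_TOKENS_O_SERIES_INPUT = 7_000;
    static final int MAX_TOKENS_DEFAULT_INPUT = 100_000; // hypothetical placeholder

    /** Simplified stand-in for the factory's canonicalModelName helper. */
    static String canonicalModelName(String modelId) {
        String lower = modelId.toLowerCase(Locale.ROOT);
        return lower.substring(lower.lastIndexOf('/') + 1);
    }

    static Family familyOf(String modelId) {
        String name = canonicalModelName(modelId);
        if (name.startsWith("gpt-5")) {
            return Family.GPT5;
        }
        // "o" followed by a digit (o1, o3, o3-mini) rather than any name starting with "o".
        if (name.matches("o\\d.*")) {
            return Family.O_SERIES;
        }
        return Family.DEFAULT;
    }

    static int inputTokenLimit(String modelId) {
        return switch (familyOf(modelId)) {
            case GPT5 -> MAX_TOKENS_GPT5_INPUT;
            case O_SERIES -> MAX_TOKENS_O_SERIES_INPUT;
            case DEFAULT -> MAX_TOKENS_DEFAULT_INPUT;
        };
    }
}
```

Separating the family check from the limit lookup keeps the o-series assumption visible in one constant instead of being implied by the GPT-5 branch.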

Reviews (1): Last reviewed commit: "Scope completion truncation to active pr..."

Greptile also left 2 inline comments on this PR.

Context used: AGENTS.md, CLAUDE.md

coderabbitai · 2026-05-17T02:25:48Z

Walkthrough

This PR refactors prompt truncation in the OpenAI completion flow to use only the selected model's token limits instead of applying the most restrictive limit across all configured providers. The truncation API now accepts a provider parameter, the internal logic is simplified to check only the resolved model, and callers delegate truncation responsibility to the request factory.

Changes

Provider-Aware Prompt Truncation

Layer / File(s)	Summary
Prompt truncation API refactoring `src/main/java/com/williamcallahan/javachat/service/OpenAiRequestFactory.java`	New public overload `truncatePromptForCompletion(String prompt, RateLimitService.ApiProvider provider)` resolves the model for the given provider and applies truncation. The original single-argument method delegates to this new overload with `OPENAI` as default. Core truncation logic now derives `gpt5Family` and `reasoningModel` from only the selected `modelId`, removing prior aggregation across both configured providers.
Truncation integration in request factory `src/main/java/com/williamcallahan/javachat/service/OpenAiRequestFactory.java`	`buildCompletionRequest` now truncates the completion prompt internally using the provider-specific overload before constructing `ResponseCreateParams`, shifting responsibility from the caller.
Streaming service caller update `src/main/java/com/williamcallahan/javachat/service/OpenAIStreamingService.java`	`complete(...)` removes local prompt truncation and passes the original prompt to `buildCompletionRequest`, which now handles truncation based on the resolved model.
Prompt truncation test coverage `src/test/java/com/williamcallahan/javachat/service/OpenAiRequestFactoryTest.java`	Two new unit tests verify that the `OPENAI` provider applies no truncation (respecting higher OpenAI limits) and that the `GITHUB_MODELS` provider truncates to the 7K GPT-5 input limit with the appropriate notice.

🎯 2 (Simple) | ⏱️ ~12 minutes

🎯 From limits mixed with care,
A prompt now flows without compare—
Each model gets its perfect slot,
No more context lost to what it's not! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Scope completion truncation to active provider' clearly summarizes the main change of limiting prompt truncation to the currently selected provider.
Linked Issues check	✅ Passed	The changes directly address issue `#40` by moving truncation logic into buildCompletionRequest to scope it to the resolved provider, matching the recommended fix and expected behavior.
Out of Scope Changes check	✅ Passed	All code changes are tightly scoped to addressing the provider-specific truncation bug: modifying OpenAIStreamingService, enhancing OpenAiRequestFactory with a new overload, and adding targeted regression tests.
Description check	✅ Passed	The PR description clearly relates to the changeset, explaining the truncation scope changes and referencing the fixes being made.


greptile-apps · 2026-05-17T02:29:55Z

    public String truncatePromptForCompletion(String prompt) {
+        return truncatePromptForCompletion(prompt, RateLimitService.ApiProvider.OPENAI);
+    }


Dead public API after this refactor

The single-argument truncatePromptForCompletion(String prompt) no longer has any production caller: its only call site in OpenAIStreamingService.complete() was removed by this PR, and buildCompletionRequest now invokes the private truncatePromptForCompletion(String, String) directly. Keeping the method conflicts with [AB1d] ("Delete unused code instead of keeping it 'just in case'") and [RC1b] ("No compatibility shims that hide defects"). The same applies to the new two-arg public overload at line 138, which is also reachable only from tests; buildCompletionRequest bypasses it and calls the private method itself.

Consider removing both public overloads and testing via buildCompletionRequest directly, which exercises the full provider-to-model-id path, or make the provider-arg overload the single public entry point.

Context Used: AGENTS.md (source)


greptile-apps · 2026-05-17T02:29:56Z

+        boolean gpt5Family = isGpt5Family(modelId);
+        boolean reasoningModel = gpt5Family || canonicalModelName(modelId).startsWith("o");

        int tokenLimit = reasoningModel ? MAX_TOKENS_GPT5_INPUT : MAX_TOKENS_DEFAULT_INPUT;


o-series models receive GPT-5's 7K token limit regardless of context window

canonicalModelName(modelId).startsWith("o") captures o1, o3, o3-mini, etc. and routes them to MAX_TOKENS_GPT5_INPUT (7 000 tokens). Many o-series models expose far larger context windows and this mismatch silently truncates prompts that would fit. This was also true before the PR, but the refactor now makes this path the single authoritative one for both providers, so the blast radius is wider. At minimum the assumption should be documented; at best, o-series models should have their own named constant and explicit limit.


Scope completion truncation to active provider

146bfb1

greptile-apps Bot reviewed May 17, 2026

pntech20 wants to merge 1 commit into WilliamAGH:dev from pntech20:codex/provider-scoped-truncation
